1. INTRODUCTION

If television pictures are radically to improve 
with the advent of HDTV then it is arguable that so 
must the sound presented to the domestic audience. 
But in what ways can the sound be improved? What 
is wrong with the sound systems currently being
offered to the public with terrestrial television,
particularly those services offering stereo by means of
a digital distribution system such as NICAM 728 [2]?
This Report will attempt to answer some of these
questions and will examine the implications of the
proposed improvements. A companion Report [3] gives
further information on the results of programme 
production work. 



2. SERVICE IMPROVEMENTS 

Whilst acknowledging that digital stereo sound 
is most enjoyable, it is easily perceived that the precise 
presentation of the sound images and the sound 
distribution across the sound stage vary with the 
listeners' positions. Even assuming that the listeners 
wish to adjust their rooms and equipment for 
optimum reproduction, no two people in the same 
room will hear exactly the same reproduced sounds; 
their perception of the locations of sound images from 
a stereo system will vary with their location relative to 
the positions of the loudspeakers. It is because of such
variations that the cinema industry has long been
using multichannel reproduction systems. This allows
the programme producers to exploit the directional 
cues in their sound balancing for dramatic and artistic 
reasons. 
Fig. 1 - Recommended listening area.
[EBU] 



This is seen as the first major requirement of
sound with HDTV: that the system should be able
to provide, for listener and programme maker alike, a
better and more controlled listening experience. The
EBU, like many organisations, has attempted to specify
such improvements [4] and has produced a recommendation
showing a defined listening area (see Fig. 1),
anywhere in which the listening experience should be
totally predictable. This area, for domestic listening, is
between one and one and a half times the width of
the screen and extends from two and a half times the
picture height (H) to four and a half times the picture
height.

This obviously has consequences on the 
number of loudspeakers needed to recreate the 
controlled sound events and also on the number of 
transmission channels required to drive those 
loudspeakers. In this context some of the pioneering 
work of the seventies should be borne in mind, even 
though that work was for sound-only systems. 

One particular study of relevance [5] examined
the acuity of the ear to directional cues. (The term
'ear' is used here to cover the ear/brain combination.)
Fig. 2 shows the accuracy with which a group of
subjects were able to detect the direction of a sound,
whose source they could not see, from all directions in
an anechoic chamber. Though somewhat less accurate
for those directions out of the frontal quadrant,
nowhere could the results be said to be poor. Fig. 3
shows the results for the co-positioning of two sound
sources under otherwise similar test conditions. These
results are even more accurate.

However, if one looks at the generation of
phantom images between pairs of loudspeakers
arranged around a listener in a square format (Fig. 4),
very variable and uneven results are produced. Whilst
front and back quadrant results are rather similar and
follow an approximate sine/cosine law (cf. stereo
imaging), the side quadrant images show a strong
tendency to be drawn to the front of the listener and
are very unstable with head movement. There is also a
noticeably diffuse quality to the side images, making it
less certain just where they are located. Thus, the use
of surround sound as a way of giving precise
directional cues to the listener from all directions is
unattainable if the number of loudspeakers is
limited; moreover, the necessity and desirability of achieving
such an aim are unproven.
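
The sine/cosine law referred to above can be illustrated with a short sketch. It is not taken from the cited study: the function names and the normalised position parameter (0 at one loudspeaker, 1 at the other) are assumptions made for illustration, and the law is only said to approximate the front and back quadrant results.

```python
import math

def sine_cosine_gains(position):
    """Return (gain_a, gain_b) for an image between loudspeakers A and B.

    position is a normalised value: 0.0 at loudspeaker A, 1.0 at B.
    (This 0..1 parameterisation is an assumption of the sketch.)
    """
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)

def level_difference_db(position):
    """Interchannel level difference (A relative to B) in dB.

    Intended for 0 < position < 1; at position 0 the gain of B is zero
    and the ratio is undefined.
    """
    gain_a, gain_b = sine_cosine_gains(position)
    return 20.0 * math.log10(gain_a / gain_b)
```

With this law the two gains always satisfy gain_a^2 + gain_b^2 = 1, so the total reproduced power stays constant as the image is moved between the pair.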

Fig. 2 - Absolute sound localisation in the free field room.

Fig. 3 - Relative sound localisation in the free field room.

Fig. 4 - Interchannel level-difference versus image location for adjacent pairs of
loudspeakers, in a free field room with a reflecting floor.

In the domestic environment there is seldom a
household where people listen/view only one at a
time. It is much more common, and probably the
norm, for families to view collectively, at least for
some of the time. The BBC's experimental Television
Stereo work, now amounting to the production of
many hundreds of programmes, has shown that whilst
aural and visual directional cues can be made to
match for a single viewer/listener, it is much more
difficult to achieve such a result, even for just two
people sitting side by side. If, therefore, those
directional cues are to be considered a valid and
usable part of the producer's armoury in creating an
element of a programme, the producer has to be sure of where
the sounds will be reproduced for the audience. It
could be argued, in the context of HDTV, that if the
picture is both larger and more detailed, then the
importance of this factor is increased.

This point has been addressed in a recent
CCIR submission from the Federal Republic of
Germany [6]. Fig. 5 shows the relevant results, demonstrating
clearly that an increase in the number of
loudspeakers (and channels) used to reproduce a
frontal sound stage will increase the usable listening
area for a given accuracy* of sound reproduction. The
move from two channels to three increases the width
of the listening area by a factor of about three, whilst
the use of four channels increases it by a further factor
of four.

Another matter that should be borne in mind 
is that the eye is more powerful than the ear. If there
is doubt as to where a sound is coming from, the brain
places more emphasis on visual cues than on aural
cues. Recent work by the Japanese broadcaster
NHK [7] has examined this in some detail. Fig. 6 shows
both the experimental set-up and the results for this 
work. Specifically, they found that for a picture of a 
talking person, with the sound mislocated by 
10 degrees, listeners were easily able to detect the 
error, even if they were not disturbed by the 
mislocation. However, an error of 15 degrees or more 
started to cause some annoyance to the audience. Such 
an error is easily created by off-centre listening. 

What, then, is the optimum choice of sound 
system for HDTV? In particular, how many channels 
are needed, and how should they be reproduced? 
Although the Japanese have also addressed this
question [8], it is one that is still being explored by many
other workers in the field. The Japanese results can be
summarised by Fig. 7. This shows seven different
loudspeaker arrangements and the results of evaluations 
on them, both for realism and for stability and 
accuracy of sound location for an off-centre listener. 

* Accuracy is expressed as the difference between the intended and achieved directions
of a sound image, as a percentage of the angular sound
stage width.

Fig. 5 - Listening areas for multichannel stereophony.
[IRT]

Interestingly, it is the two surround systems with a 
central loudspeaker between the stereo pair that score 
well on both counts. As far as directional cues are 
concerned the 3-channel frontal system also scores 
highly. This work, however, only used one musical 
item for its test material and thus much more work is 
required before such results could be claimed to apply 
to the whole gamut of television programmes. It is 
also notable that the NHK studies did not report any 
assessments of systems with more than a single 
surround channel, despite other workers seriously 
considering two or even four surround channels. Their 
work is, however, a useful pointer for others. 

The IRT [9] have reported some preliminary
studies on the number of surround channels that will
be required for HDTV. Their work sought a
judgement of preference from their listeners when
listening to recordings of applause, atmosphere and
music for 1-channel, 2-channel and 4-channel surround
reproduction. Their test arrangement is shown in
Fig. 8 whilst the results are shown in Fig. 9. Starting
with a 4-channel source, either two 3 dB pads are
used to cross mix the two left surround sources and
the two right surround sources to simulate a 2-channel
system, or a 6 dB pad is used to cross mix all four
surround sources to simulate a 1-channel system.
These three presentation options were then compared,
using the CCIR seven-point comparison scale. The
results show a distinct preference for 2-channel and
4-channel over 1-channel surround presentations, for
both applause and atmosphere, with little to choose
between 2-channel and 4-channel. For the recording
of music, where the surround channels are carrying
reverberation in the presence of higher levels of direct
sounds from the front, a degree of spatial masking
appears to be occurring and very little preference is
recorded for any of the presentations. Thus, from this
work it could be concluded that there is a distinct
benefit in 2-channel surround over 1-channel surround,
but little point in going beyond 2-channel surround.
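
The cross mixing described above might be sketched as follows. This is an illustration only: the exact pad topology of Fig. 8 is not reproduced here, so the sketch assumes that each pad is applied to the sum of the sources it combines, and the function and argument names are hypothetical.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

def simulate_two_channel(sl1, sl2, sr1, sr2):
    """Cross mix each side pair through a 3 dB pad.

    Returns (left, right); each result is assumed to feed both
    loudspeakers on its side.
    """
    pad = db_to_gain(-3.0)
    return pad * (sl1 + sl2), pad * (sr1 + sr2)

def simulate_one_channel(sl1, sl2, sr1, sr2):
    """Cross mix all four surround sources through a 6 dB pad.

    The single result is assumed to feed all four surround loudspeakers.
    """
    pad = db_to_gain(-6.0)
    return pad * (sl1 + sl2 + sr1 + sr2)
```

With these pad values, the summed power of uncorrelated sources of equal level is roughly that of a single source (2 x 0.5 for the 3 dB case, 4 x 0.25 for the 6 dB case), which keeps the three presentations at comparable loudness.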

Fig. 7 - Assessment of multichannel sound systems.
[NHK]

The BBC has also carried out some informal
assessments of different forms of sound presentation
during some of its early HDTV programme production
activities. The material used for these tests was sport
(tennis and football), which had been mixed for a 3/2
(i.e. 3 front loudspeakers/2 rear loudspeakers) format
of reproduction. The team involved in the sound
balancing were presented with reproduction of this
material in 3/2, 3/1, 3/0 and 2/0 formats, using fixed
compatibility matrixing to generate the derived formats
from the 3/2 original. They were asked to consider
the 3/2 presentation, on which they had been
working, as the best available sound system for HDTV
and to consider 2/0 (stereo) as a low anchor. They
were then asked to place 3/0 and 3/1 on a linear
scale of improving quality (of spatial reproduction and
enjoyment) relative to the two end points of 3/2 and
2/0. 3/0 was easily placed by all four staff at about
half way between the two extremes. 3/1 averaged
about the same result; in discussion, it was rated better
on some counts than 3/0 but worse on others. In
particular, better spaciousness due to the surround
presentation was considered beneficial, but this was
offset by misleading directional cues, due to the
central rear image of the 3/1 presentation. Thus, it was
concluded that better frontal image stability gave rise
to a preference for three frontal loudspeakers but that
surround sound was probably not significantly better
unless at least two surround channels were available.

Consequently, additional benefits to the HDTV
listener were seen to be derived from the provision of
three or more frontal channels and two or more
channels for surround sound. As in the case of the
increased listening area, this is following the trend set
by the film industry, but the number of loudspeakers
and the number of separate sound feeds should
obviously be the subject of much more thorough tests,
such as those currently being prepared under the
umbrella of the Eureka 95 Consortium.

Fig. 8 - Studies on number of surround sound channels.
(a) 2/4 loudspeaker arrangement used in the test.
(b) Surround channel cross mixing.
[IRT]

Fig. 9 - Preference for surround sound presentations.

One of the ways in which HDTV is likely to
be delivered to the European home, and in fact the
way it is being broadcast at present in Japan, is via
satellite. Though the footprint of each satellite service
is carefully controlled, in order to preserve signal
power, the signals clearly cross international borders.
Indeed, some services are deliberately designed to be
multinational. It was realised from the outset, then,
that HDTV sound must have the capability of
providing multi-language programmes.

Finally, in the list of new service options, there
has long been a debate on whether terrestrial television
should make some sort of compromise to its sound
balancing, particularly in relation to the level of sound
effects [10], in order to improve the intelligibility for the
hard-of-hearing. Whilst this same compromise
approach could be pursued with HDTV, it would be
far better if a dedicated sound channel could be
provided for, say, clean dialogue specifically for the
hard-of-hearing. This approach has been adopted by
both the CCIR and EBU in their work, and ways in
which such a sound channel could be provided are
being studied. One option currently gaining favour is
to consider the clean dialogue channel for the
hard-of-hearing as just another language channel. This would
have the distinct advantage that no special receiving
equipment would be required by the hard-of-hearing;
a standard receiver with language options would
suffice and would thus be somewhat cheaper than a
special receiver. It should not be assumed, however,
that clean dialogue is always available regardless of
the programme type. It is easily shown that, even with
lip-microphones, the commentary signal at a large
sporting event such as football can contain almost as
much crowd effect as is required for the final
programme mix. This, as can be imagined, poses real
problems to the sound balancing engineer for the
normal programme, regardless of any new clean
dialogue requirement. What is important, however, is
to make provision in the HDTV sound specification
for a clean dialogue channel. The extent to which it
can then be utilised will subsequently depend on many
factors.

Thus, the HDTV set-up of the future may look
like that shown in Fig. 10. A large wide screen will
present pictures with startling realism. Multiple
loudspeakers will provide a better rendering of, say, 
the orchestra than hitherto dreamt of, both by better 
location of the instruments, using three or more front 
loudspeakers, and by reproduction of the surrounding 
hall ambience on surround channels. The commentary 
will be selectable between a number of languages, the 
chosen one being automatically mixed in with the 
multichannel sound or, for those with hearing 
difficulties, a clean feed of the speech (with no effects) 
will be available in place of the normal sound. 



But will it be so simple to provide such
services?

Fig. 10 - The HD experience?



3. PROGRAMME EXPERIMENTS 

In an attempt to assess the practicality of such
developments the BBC and other programme organisations
have been carrying out experimental programme
productions. To date, the BBC has experimented with
sports, music, drama, ceremonial and documentary
programmes, each of which has been recorded on
multitrack tapes in order to facilitate subsequent
post-production mixing tests. The findings of those
experiments can be typified by those of a joint BBC,
IRT and WDR post-production session carried out at
WDR's Cologne premises, on behalf of the Eureka 95
consortium. One of the programmes used was a BBC
recording of The Horse of the Year Show, a horse
jumping competition for riders of all ages held in a
large indoor arena.

The arrangement at the stadium is shown in
Fig. 11, where eight type MKH 416 gun microphones
(numbered 1 to 8) were suspended at a high level
above the arena to pick up the arena sound effects and
a pair of cardioid microphones was used to pick up
the sounds of one section of the audience. Inevitably,
in a closed stadium, audience sounds and public
address sounds were also picked up, to some extent,
on all microphones, but this did not adversely affect
the results of the experiment.

Fig. 11 - Microphone arrangement for Horse of the Year Show.

The layout of the control room is given in
Figs. 12 and 13. It was a room of about 150 m^
acoustically designed on the basis of Live-End-Dead-End.
To enable comparisons to be made between the
different forms of sound reproduction, nine similar 
loudspeakers were arranged as shown; in this way, up 
to four surround loudspeakers could be used together 
with either two, three or four front loudspeakers. The 
formats studied included 4/4, 4/2, 3/2, 3/1 and 3/0. 
The aim was to mix a sound balance for the most 
complex presentation 4/4 and see how this would be 
affected by presentation on fewer loudspeakers, 
making minimal changes to the sound balance 
between presentations. 

Fig. 12 - Production tests at WDR, Cologne.

The principle of the sound balance was to
generate a fixed acoustic view of the arena, from the
point of view of a member of the audience covered by
the audience microphones; as already shown, these
were in the middle of one of the longer sides of the
arena. In addition, it was required that the action in the
arena should progress steadily from the left hand side
of the reproduction area to the right hand side.

For the 4/4 presentation, the optimum mix
derived was as shown in Fig. 14. The arena
microphones covering the far half of the action, judged
from the stated listener position, were fed directly to
individual loudspeakers as shown (2 to L, 4 to Cl
etc.). Microphones 3 and 5 were similarly fed to Cl
and Cr directly. In contrast, however, the feed from
microphone 1 was panned half way between Side Left
(S1) and Left (L). Microphone 7 was treated in a
corresponding way. Finally, the audience microphones
were fed to the rear channels S3 and S4. The end
product was a sound balance with the arena action
spread evenly across a wide front sound stage with the
audience and public address effects not just behind the
listener but all round the presentation area; a dramatic
and enjoyable presentation. The listening area was, as
predicted, significantly larger than that provided by
stereo, and sound locations did not vary noticeably,
even for large changes in listening position.
Fig. 13 - The Cologne mixing room.

Fig. 14 - Optimum 4/4 presentation.

Having assessed the 4/4 presentation, a
simple reduction [11] from 4/4 to 3/4 was attempted
by addition of Cl and Cr into C. The result, in the
3/4 presentation, was most peculiar; a galloping
horse would progress steadily from, say, the left
extreme to the centre, it would then gallop on the spot
for a period (even though the picture showed
otherwise) and finally gallop from centre to right.
Obviously, a more complex reduction algorithm was
needed.

The solution was to remix the sound for the 
front channels such that the contributions for, say, 
microphone 4 in loudspeakers L and C produced a
phantom image in the same position as loudspeaker Cl
and so on. Once remixed, 3/4 gave rise to a
subjective effect virtually indistinguishable from that of 
the 4/4 presentation. Further optimisation was then 
carried out narrowing the images created by 
microphone feeds 3 and 5. The final 3/4 presentation 
is shown in Fig. 15. 

The equivalent of this remixing is shown in 
Fig. 16 which shows that, for a reduction from 4 to 3 
front loudspeakers, a proportional cross mix of 
0.45/0.89 will produce the required effect. A further 
reduction from 3 to 2 front loudspeakers is effected by 
mixing equal proportions of C into L and R as also 
shown in the figure. (Note, sine/cosine mixing 
relationships seem to be appropriate.) 
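
The front channel reduction can be sketched in code. The 0.45/0.89 proportions are those quoted above; everything else is an assumption made for illustration, since Fig. 16 is not reproduced here. In particular, the sketch assumes that the larger proportion goes to C (Cl and Cr lie nearer the centre than the extremes), and the gain used when mixing C into L and R is a hypothetical -3 dB value, as the text gives only "equal proportions".

```python
import math

G_FAR = 0.45   # proportion sent to the outer loudspeaker (L or R)
G_NEAR = 0.89  # proportion sent to C (assumed to be the nearer loudspeaker)

def reduce_4_to_3(l, cl, cr, r):
    """Reduce front channels L, Cl, Cr, R to L, C, R.

    Cl and Cr are cross mixed so that each reappears as a phantom
    image between C and the adjacent outer loudspeaker.
    """
    l3 = l + G_FAR * cl
    c3 = G_NEAR * (cl + cr)
    r3 = r + G_FAR * cr
    return l3, c3, r3

def reduce_3_to_2(l, c, r, centre_gain=math.sqrt(0.5)):
    """Reduce L, C, R to L, R by mixing C equally into both.

    centre_gain (about -3 dB) is an assumed value, not one from the text.
    """
    return l + centre_gain * c, r + centre_gain * c
```

Note that 0.45^2 + 0.89^2 is approximately 1, which is consistent with the remark that sine/cosine mixing relationships seem appropriate.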

Reduction from 3/4 to 3/2 was effected by
using the matrix

    S1' = S1 + (S3 - 3 dB)      S2' = S2 + (S4 - 3 dB)      (1)

Fig. 15 - Optimum 3/4 presentation.

Fig. 16 - Front channel reduction matrix.

A higher level of S3 and S4 was found to give 
too loud an audience effect; hence the 3 dB reduction. 
When these surround signals were reproduced over 
just the side loudspeakers (S1 and S2), the effect was
good at the optimum location but the depth of the 
listening area was reduced. However, with a further 
3 dB reduction in level of these derived signals it was 
found that they could be fed to all four loudspeakers, 
improving the area of listening without further 
problems. 
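
Equation (1) and the subsequent feed to all four surround loudspeakers might be sketched as follows. The 3 dB figures are those given above; the pairing of S1' with loudspeakers S1 and S3, and of S2' with S2 and S4, is an assumption of this sketch, and the function names are hypothetical.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

def reduce_surround_4_to_2(s1, s2, s3, s4, rear_db=-3.0):
    """Apply Equation (1): add the attenuated rear signals to the side signals."""
    g = db_to_gain(rear_db)
    return s1 + g * s3, s2 + g * s4

def feed_four_loudspeakers(s1_prime, s2_prime, extra_db=-3.0):
    """Feed the two derived signals to all four surround loudspeakers.

    A further 3 dB reduction is applied, as described in the text.
    The loudspeaker pairing used here is an assumption.
    """
    g = db_to_gain(extra_db)
    return {
        "S1": g * s1_prime, "S3": g * s1_prime,
        "S2": g * s2_prime, "S4": g * s2_prime,
    }
```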

Reduction from 3/2 to 3/1, by adding S1' and
S2' in Equation (1) above, produced problems of a
mono rear audience image whether the signal was fed
to either two or four surround loudspeakers; there was
also a sense of comb filtering of the sounds. The only
way of reducing these effects was to introduce delays
into the surround loudspeaker feeds of 9, 12, 22 and
28 ms. These delays de-correlated the sound
sufficiently to remove the stated problems but they
failed to recreate the sense of spread of images in the
audience; they just produced a distribution of sound
energy lacking in realism.
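
The mono surround feed with delays can be sketched as below. Only the four delay values come from the text; the sample rate, the order in which the delays are assigned to the loudspeakers and the function names are assumptions made for illustration.

```python
DELAYS_MS = (9.0, 12.0, 22.0, 28.0)  # delay values quoted in the text

def delay_signal(signal, delay_ms, sample_rate=48000):
    """Delay a list of samples by prepending silence.

    sample_rate is an assumed value; the text does not state one.
    """
    n = round(delay_ms * sample_rate / 1000.0)
    return [0.0] * n + list(signal)

def mono_surround_feeds(s1_prime, s2_prime, sample_rate=48000):
    """Sum S1' and S2' to mono and derive four delayed loudspeaker feeds.

    Returns one list of samples per delay in DELAYS_MS; which delay goes
    to which loudspeaker is not specified in the text.
    """
    mono = [a + b for a, b in zip(s1_prime, s2_prime)]
    return [delay_signal(mono, d, sample_rate) for d in DELAYS_MS]
```

The feeds differ only in timing, which is consistent with the observation that the delays remove the comb filtering but cannot restore a spread of distinct audience images.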

Further reduction from 3/2 to 3/0 was 
ultimately achieved by attenuating S1' and S2' by
3 dB and adding them to L and R respectively. The 
attenuation was required to lessen the impact of the 
audience effects. This gave a good stable arena sound 
stage with an even spread of audience. 
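
A minimal sketch of this 3/2 to 3/0 reduction follows. The 3 dB attenuation and the routing of S1' to L and S2' to R are as stated above; the function name and argument order are hypothetical.

```python
def reduce_3_2_to_3_0(l, c, r, s1_prime, s2_prime, surround_db=-3.0):
    """Fold the two surround signals into the front left and right channels."""
    g = 10.0 ** (surround_db / 20.0)  # dB to linear amplitude gain
    return l + g * s1_prime, c, r + g * s2_prime
```

The centre channel is passed through unchanged, which is why the audience remains spread across the sound stage rather than collecting in the middle.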

The alternative reduction from 3/1 to 3/0 was 
less successful. If the mono surround was added, either 
to C or at -3 dB to L and R, it gave a crowd effect
that was unnaturally biased towards the centre of the 
sound stage. If the mono surround signal was fed in 
antiphase to the L and R loudspeakers, it produced 
acceptable results on ambient sound but totally 
unacceptable results on applause; the phasing was 
obvious and wrong. 
The findings of this and other programme
experiments can be summarised as follows in the
context of domestic listening areas. Three front
loudspeakers make a significant improvement in the
controlled reproduction of the sound event. Four front
loudspeakers have not yet been demonstrated to be 
better. Surround sound can be most enjoyable but the 
surround information needs at least two separate 
channels: mono surround seems to work on room 
ambience but it does not produce realistic effects for 
such things as crowds/audiences and outdoor sounds. 
Whether there is a need for four surround channels 
has still to be determined, although at this stage the 
benefit is felt to be small. The reduction from 
surround to frontal presentations needs careful 
consideration. In the example above a 3 dB attenuation 
of the surround sounds was needed. In other 
programmes [12], attenuations varying from dB to
40 dB were found to be necessary. Further work will 
determine if there is an optimum. 
4. OTHER SOUND FACTORS



Fig. 17 - Typical distribution of voice images in Television 
Stereo. 



Though not studied in the context of The 
Horse of the Year Show, other programme studies 
have addressed the subject of multiple languages. In 
1989 the BBC recorded both the FA Cup Final and 
the Wimbledon championships in HDTV with 
multichannel sound. On both occasions some of the 
sound channels were given over to non-English 
commentary. The unique aspect of commentary is that 
whilst it is related to the main sound channels in terms 
of timing, it is not related spatially to the main 
channels when the commentator is not in the picture. 
Thus, for all sporting events, any number of languages 
can be accommodated at the expense of one 
transmission channel (possibly not even full bandwidth)
per extra language. 

The same conclusion, however, does not apply 
to other types of programme. If one considers drama, 
any dialogue in the scene will be intended to trigger 
off natural room reverberation. Thus, even though the
actor may be in the front with the direct component
of his/her voice in the centre channel, the reverberation
resulting from the voice will have to come from
all channels if the sense of spatial realism is to be 
preserved. 

It should also be recognised that much more
interesting and realistic programmes can be made if
the spoken word is not restricted to centre front, as it
is in the cinema. For many years, the BBC has been
studying the subject in the context of Television Stereo
and it has found (see Fig. 17) that many of the spoken
contributions to a programme can be displaced
significantly from the centre, and indeed should be, if
the programme is not to be flat and uninteresting. The
same undoubtedly will apply to multichannel sound
with HDTV, particularly if larger screens become the
norm. Again, it is seen that the spoken word should
not be constrained to a single channel.

It has to be concluded, therefore, that in the
context of multiple language broadcasting, there are
two separate categories of programmes, one with only
out-of-vision commentary and one with spatial
interrelationships between the picture and the voices. In the
first case, each additional language requires only one extra
sound channel; in the latter, each language
requires a full complement of sound channels, which
for surround sound may be an extra five channels.

Consequently, the sum total of sound channels 
for HDTV could be five for the first language, plus 
one for the hard-of-hearing clean feed. Additional 
languages would require one extra channel each for 
commentary purposes or more channels if other 
programme types with spatial distribution were 
required. Provision of such services will not be easy. 
Even D-MAC has capacity for only eight full-quality 
channels and whilst this would be enough, on the 
above basis, for one language there is insufficient 
capacity for extra languages unless changes to the 
D-MAC sound coding were to be adopted. Indeed, 
such an approach is currently under consideration. 
Philips are proposing a Hidden Channel approach [13]
that uses a variation on sub-band coding to hide extra
channels in the lower significant bits of existing
channels. IRT, on the other hand, are proposing the
use of high ratios of bit-rate reduction using sub-band
coding [14], which has the useful benefit of being
common to the proposals for DAB. Which of these
will ultimately be chosen is unknown at present, but
bit-rate reduction of some kind will have to be used 
and careful subjective tests will have to be undertaken 
to ensure transparency of the system to broadcast 
sound signals. 
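
The channel arithmetic of this section can be illustrated with a short sketch. The figures of five main channels, one clean dialogue channel and eight full-quality D-MAC channels are those quoted above; the function and parameter names are hypothetical, and the sketch ignores any saving from reduced-bandwidth commentary channels or bit-rate reduction.

```python
D_MAC_FULL_QUALITY_CHANNELS = 8  # capacity figure quoted in the text

def channels_required(commentary_languages=0, spatial_languages=0,
                      main_channels=5, clean_dialogue=True):
    """Count the sound channels needed for a given service mix.

    commentary_languages: extra languages needing one channel each.
    spatial_languages: extra languages needing a full set of channels.
    """
    total = main_channels + (1 if clean_dialogue else 0)
    total += commentary_languages
    total += spatial_languages * main_channels
    return total
```

On these assumptions, a single-language service with a clean dialogue feed needs six channels, whereas one additional language with full spatial distribution brings the total to eleven, beyond the eight-channel figure quoted for D-MAC.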

5. DOMESTIC CONSIDERATIONS 

In all the debate on sound systems (whether
surround is needed, how many channels, the quality of
those channels etc.) it has to be remembered that
the ultimate arbiter of acceptability will be the
domestic consumer. He or she is much more likely to
be swayed by programme choice and enjoyability,
coupled with considerations of the seating arrangements,
rather than by the engineering niceties of how
many bits are required per channel. As far as the 
consumer is concerned, it is the author's belief that 
any system, to be acceptable, has to fit into present 
domestic environments with the minimum of fuss and 
change. Whilst engineers might well recommend the 
optimum or ideal listening arrangement (Fig. 18), the 
listener at home will adopt an arrangement that is less 
demanding on the environment. Non-uniform layouts 
of loudspeakers will occur with embarrassing frequency 
and people will listen from all sorts of angles. It has to
be remembered also, that most of the systems will be 
set up and operated by non-engineers. Thus the systems 
will only succeed if they are robust to all manner of 
misalignments. This is one of the main reasons why 
multichannel, multiple loudspeaker systems are being 
developed rather than systems which might be less 
demanding of, say, the transmission system by 
requiring fewer channels but which would need much 
more control over the listening environment. 




Fig. 18 - Domestic circumstances will prevail!
(Panels: ideal; practical.)



However, it must not be assumed that 
everyone would want to install a surround sound 
system, even if finance were not a limitation. As in the 
seventies, when quadraphony was being developed, 
there continues to be a body of opinion that surround 
sound in the home is psychologically wrong, and that 
frontal sound is all that would be acceptable. This has 
to be accepted as a valid point of view, and any 
system to be developed should allow for such listener
choices. It was for such reasons that compatibility
matrixing [15] was proposed. This allows listeners to pick
and choose the optimum system for themselves, be it
stereo, 3-channel frontal or full surround.

But above all, it will be the programmes that
will be the deciding factor in the success or otherwise
of HDTV. If the programmes are right and do justice
to the HD presentation, then both the picture and 
sound systems will be seen to be a necessary adjunct 
to domestic entertainment. 



6. RECOMMENDATIONS FOR FURTHER 
WORK 

Further tests are required fully to justify any conclusions
on the number of sound channels, the number
of loudspeakers and their distribution in the listening 
area. If at all possible, these should use optimised 
sound mixes for each presentation format, rather than 
deriving several presentations from a single mix. 

Likewise, tests will be needed on any bit-rate 
reduction scheme required for the transmission 
of surround sound broadcasts. These should explore 
the full gamut of programme types and sound 
combinations. 

The possibilities of user choice in an HDTV 
sound system should be retained, so that each 
household can install its own choice of optimum
sound system. This need not necessarily lead to a
more expensive form of radiated signal or system, as 
long as the proposed compatibility provisions are 
retained. 



7. CONCLUSIONS 

Developments relating to the sound system for 
HDTV have been presented. 

It has been shown that, for several reasons, 
better sound will result from an extension of the 
number of sound channels and loudspeakers beyond 
the conventional stereo pair. Ways in which these 
channels can be exploited and some of the constraints 
on their use have been outlined. 
Additional services, such as multilingual
commentary and clean dialogue for the hard-of-hearing,
can also enhance the consumers' enjoyment of
HDTV programmes. The implications of these services 
and ways in which they should be provided and used, 
have been summarised. 



8. ACKNOWLEDGEMENTS

The author would like to thank the staff of 
WDR Cologne for their contribution to the work 
referenced above. 

The results given in Figs. 6 and 7 are 
reproduced with the permission of NHK. The results 
given in Figs. 5 and 9 are reproduced with the 
permission of IRT. 



9. REFERENCES 

1. TANTON, N.E. and STONE, M.A. HDTV 
displays: subjective effects of scanning standards 
and domestic picture sizes. Proceedings of 
1988 International Broadcasting Convention, 
pp 204-211, and BBC Research Department 
Report No. BBC RD 1989/9. 

2. BBC/IBA/BREMA, 1988. 'NICAM 728: Speci- 
fication for two additional digital sound channels 
with System I television'. BBC/IBA/BREMA 
Joint Publication, August 1988. ISBN
0 563 20716 7.

3. MEARES, D. 1991. HDTV sound: programme 
production developments. BBC Research 
Department Report No. 1991/14. 

4. EBU, 1989. EBU requirements for an HDTV 
sound system. CCIR Document No. IWP 
10/12-02-E. 

5. RATLIFF, P.A. 1974. Properties of hearing
related to quadraphonic reproduction. BBC
Research Department Report No. BBC RD
1974/38.



6. Federal Republic of Germany. Suitable number 
of sound channels to accompany wideband 
HDTV. CCIR Document No. 10/267-E. 

7. KOMIYAMA, S. 1989. Subjective evaluation of
angular displacement between picture and sound
directions for HDTV sound systems. J. Audio
Eng. Soc., 37, No. 4, April 1989, pp 210-214.

8. OHGUSHI, K. et al. 1986. Subjective evaluation 
of multi-channel stereophony for HDTV. 
Proceedings of the 81st Audio Engineering 
Society Convention, November 1986. Preprint 
No. 2363. 

9. THEILE, G. 1991. HDTV sound systems: how 
many channels? Proceedings of the AES 9th 
International Conference 'Television Sound 
Today and Tomorrow', Detroit, Michigan, 
1-2 February 1991, pp 217-232. 

10. MATHERS, C.D. 1991. A study of sound
balances for the hard of hearing. BBC Research 
Department Report No. BBC RD 1991/3. 

11. THEILE, G. 1990. The natural rendering of 
sound images in broadcasting. EBU Review - 
Technical, No. 241/242, June-August 1990. 

12. MEARES, D.J. 1991. High definition sound for 
HDTV. Proceedings of the AES 9th International 
Conference 'Television Sound Today and 
Tomorrow', Detroit, Michigan, 1-2 February 
1991, pp 187-215. 

13. ten KATE, W.R., van de KERKHOF, L.M. and 
ZIJDERVELD, F.F. 1990. Digital audio carrying 
extra information. 15th International Conference
on Acoustics, Speech and Signal Processing, 
April 1990. Preprint No. 6.A1.2. 

14. THEILE, G. et al. 1988. Low bit rate coding of 
high-quality audio signals. An introduction to the 
MASCAM system. EBU Review - Technical,
No. 230, 1988. pp 158-181. 

15. See Appendix 1 in the companion Report [3].



Printed by BBC RESEARCH DEPARTMENT, Kingswood Warren, Tadworth, Surrey, KT20 6NP



